In computational biology, de novo protein structure prediction is the task of estimating a protein's tertiary structure from its amino acid sequence alone. The problem is very difficult and has occupied leading scientists for decades. Research has focused on three areas: alternative lower-resolution representations of proteins, accurate energy functions, and efficient sampling methods. At present, the most successful methods have a reasonable probability of predicting the fold of a small protein domain to within 5 angstroms of the native structure. [1]
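The text does not say which measure the 5 angstrom figure refers to. One common way to quantify agreement between a predicted and an experimentally determined structure is the root-mean-square deviation (RMSD) over corresponding atoms, and the sketch below illustrates that measure under this assumption. It also assumes the two coordinate sets have already been optimally superimposed, which a real comparison would do first; the function name `rmsd` and the sample coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points.

    Assumes the two structures are already superimposed; no rotation or
    translation fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical coordinates in angstroms: the model is the native structure
# displaced by 3 angstroms along x, so every atom deviates by exactly 3.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(x + 3.0, y, z) for x, y, z in native]
print(rmsd(native, model) <= 5.0)
```

Under this reading, a prediction "within 5 angstroms" is one whose RMSD to the native structure does not exceed 5.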
De novo protein structure prediction methods attempt to predict tertiary structures from sequences based on general principles that govern protein folding energetics and/or statistical tendencies of conformational features that native structures acquire, without the use of explicit templates. A general paradigm for de novo prediction involves sampling conformation space, guided by scoring functions and other sequence-dependent biases, such that a large set of candidate ("decoy") structures is generated. Native-like conformations are then selected from these decoys using scoring functions as well as conformer clustering. High-resolution refinement is sometimes used as a final step to fine-tune native-like structures. There are two major classes of scoring functions. Physics-based functions are based on mathematical models describing aspects of the known physics of molecular interaction. Knowledge-based functions are built from statistical models that capture aspects of the properties of native protein conformations [2].
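The sample-then-select paradigm described above can be sketched on a toy model. Everything in the example is an illustrative assumption rather than a method from the cited literature: the protein is reduced to a two-dimensional chain of hydrophobic (H) and polar (P) beads, the scoring function is an invented contact reward plus a clash penalty, sampling is plain Metropolis Monte Carlo, and selection picks the decoy with the most structural neighbours. All function names, thresholds, and the test sequence are hypothetical.

```python
import math
import random


def build_chain(angles):
    """Turn a list of bond turning angles into 2D bead coordinates (unit bonds)."""
    x, y, heading = 0.0, 0.0, 0.0
    coords = [(x, y)]
    for turn in angles:
        heading += turn
        x += math.cos(heading)
        y += math.sin(heading)
        coords.append((x, y))
    return coords


def score(sequence, angles):
    """Toy scoring function: lower is better. Rewards H-H contacts, penalises clashes."""
    coords = build_chain(angles)
    total = 0.0
    for i in range(len(coords)):
        for j in range(i + 2, len(coords)):  # skip beads bonded to each other
            d = math.dist(coords[i], coords[j])
            if d < 0.8:
                total += 10.0  # steric clash
            elif d < 1.5 and sequence[i] == "H" and sequence[j] == "H":
                total -= 1.0  # favourable hydrophobic contact
    return total


def sample_decoy(sequence, rng, steps=500, temperature=0.5):
    """One Metropolis Monte Carlo run; returns the best (angles, score) visited."""
    angles = [rng.uniform(-math.pi, math.pi) for _ in range(len(sequence) - 1)]
    current = score(sequence, angles)
    best_angles, best_score = list(angles), current
    for _ in range(steps):
        trial = list(angles)
        trial[rng.randrange(len(trial))] += rng.gauss(0.0, 0.5)
        trial_score = score(sequence, trial)
        delta = trial_score - current
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            angles, current = trial, trial_score
            if current < best_score:
                best_angles, best_score = list(angles), current
    return best_angles, best_score


def distance_rms(angles_a, angles_b):
    """RMS difference of internal bead-bead distances (needs no superposition)."""
    a, b = build_chain(angles_a), build_chain(angles_b)
    diffs = [
        (math.dist(a[i], a[j]) - math.dist(b[i], b[j])) ** 2
        for i in range(len(a))
        for j in range(i + 1, len(a))
    ]
    return math.sqrt(sum(diffs) / len(diffs))


def select_by_clustering(decoys, cutoff=1.0):
    """Index of the decoy with the most neighbours within cutoff; ties go to the lower score."""
    def key(i):
        neighbours = sum(
            1
            for j in range(len(decoys))
            if j != i and distance_rms(decoys[i][0], decoys[j][0]) < cutoff
        )
        return (-neighbours, decoys[i][1])
    return min(range(len(decoys)), key=key)


def predict(sequence, n_decoys=20, seed=0):
    """Generate decoys from independent runs, then select one by clustering."""
    rng = random.Random(seed)
    decoys = [sample_decoy(sequence, rng) for _ in range(n_decoys)]
    return decoys, select_by_clustering(decoys)


decoys, chosen = predict("HPHPPHHPHH")
print("selected decoy", chosen, "with score", decoys[chosen][1])
```

Real methods differ in every component (representation, move set, scoring terms, clustering metric), but the control flow is the same: many independent sampling runs produce decoys, and a separate selection step picks among them. Choosing the most populated cluster rather than simply the lowest-scoring decoy reflects the selection by conformer clustering mentioned in the text.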
De novo methods tend to require vast computational resources, and have thus only been carried out for relatively small proteins. Predicting the structures of larger proteins de novo will require better algorithms and larger computational resources, such as those afforded by either powerful supercomputers (such as Blue Gene or MDGRAPE-3) or distributed computing projects (such as Folding@home, Rosetta@home, the Human Proteome Folding Project, or Nutritious Rice for the World). Although the computational barriers are substantial, the potential benefits of structural genomics (by predicted or experimental methods) make de novo structure prediction an active research field.
CASP:
Folding@Home:
HPF project:
Foldit: